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Abstract 



In Constructive Type Theory, recursive and corecursive definitions are subject to syntactic restrictions 
which guarantee termination for recursive functions and productivity for corecursive functions. However, 
^^. , many terminating and productive functions do not pass the syntactic tests. Bove proposed in her thesis 

i,^ , an elegant reformulation of the method of accessibility predicates that widens the range of terminative 

>«.i^ , recursive functions formalisable in Constructive Type Theory. In this paper, we pursue the same goal 

00 , for productive corecursive functions. Notably, our method of formalisation of coinductive definitions of 

^> ^ productive functions in Coq requires not only the use of ad-hoc predicates, but also a systematic algorithm 

^""^ ' that separates the inductive and coinductive parts of functions. 
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1 Introduction 

The proof assistant Coq [44] is an implementation of the Calculus of Inductive Con- 
structions [23] extended with inductive [40] and coinductive [31] types. Implementa- 
tions of coinductive types were first suggested by Coquand in [22] and implemented 
in Coq by Gimenez [31]. Coq has proved to be an effective tool for working with dif- 
ferent kinds of final coalgebras, such as the final coalgebra of streams [12], the final 
coalgebra of infinite binary trees [12], and some others. For instance, the machinery 
of Coq was used to define algebraic structures on real numbers [28,38,11]. 
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The specification language of Coq makes it possible to model the types and 
programs of typed programming languages. Typed programming languages usually 
provide a few basic types and a mechanism that allows the definition of inductive 
data types. When defining an inductive data type, we need to introduce constructors 
to generate the elements of the new type. A very well-known inductive data type 
is the type of natural numbers, defined using two constructors: and S. 

In Coq, one can define coinductive types in the same fashion as inductive types, 
using a few basic constructors that are also related to destructors through the 
pattern-matching construct. One can also use destructors when he wants to em- 
phasise the duality relative to inductive constructors. We will illustrate this in 
Section 2. 

A key notion in typed (functional) programming is the notion of recursion by 
which an object being defined refers to itself. Functions defined over inductive types 
are recursive by nature. E.g., most of the functions defined on natural numbers need 
to be defined recursively. In Section 2, we will discuss in more detail the syntax of 
such functions in Coq. 

Associated with recursion, there is a crucial question of termination. In general, 
there is no guarantee that a recursive function will always terminate. A solution 
to this problem is to use only structurally recursive definitions. A structurally 
recursive definition is such that every recursive call is performed on a structurally 
smaller argument. This guarantees that the recursion terminates. Constructive 
Type Theory in general, and Coq in particular, impose the structurally recursive 
condition on every defined recursive function. And thus, all functions are guaranteed 
to terminate in Coq, as we further explain in Sections 3 and 4. 

Definitions where the recursive calls are not required to be on structurally smaller 
arguments, that is, where the recursive calls can be performed on any argument are 
called general recursive definitions. Many important and well known algorithms 
are not structurally recursive but general recursive. Although many general recur- 
sive algorithms can be proved to terminate, there is no syntactic condition that 
guarantees their termination and thus, general recursive algorithms have no direct 
formalisation in Constructive Type Theory and in Coq. Several solutions to the 
problem of encoding general recursive functions have been suggested in [1,9,20]. 
Other methods include the use of accessibility predicates [2,4,39] and ad-hoc predi- 
cates [14], the two latter methods are surveyed in Section 4. Most of these methods 
ultimately rely on structural recursion. 

Already in the case of inductive types, there exists a difference between the 
class of functions satisfying the semantic condition of termination, and the class of 
functions satisfying the syntactic condition of structural recursion. It is significant 
that, when working with coinductive types in Coq, we find the notions of produc- 
tivity [24,43] and guardedness [22,31] dual to those of termination and structural 
recursion. This is analysed in Sections 3 and 4. Similarly to the inductive case, the 
guardedness condition bans Coq formalisations for many useful productive func- 
tions. This problem was tackled (e.g.) in [1,9,29,30] for type theory and in [36] for 
HOL. 

A particular application of an ad-hoc predicate for defining corecursive filter 
function on streams in Coq first appeared in [10], where it was used to formalise 
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Eratosthenes' sieve. Already in that example, the filter function was decomposed 
into inductive and coinductive components, in order to become guarded. A similar 
result, also for filter functions on streams, was described by M. Niqui [37,38]. 

However, neither [10] nor [37,38] included the systematic description of the 
method of separating inductive and coinductive components of productive values 
in general case. The general description of how the ad-hoc predicates can be con- 
structed for corecursive functions was missing, too. 

In this paper we venture to generalise the results obtained in [10], and describe 
the method of formalising productive (non-guarded) values in Coq for any given 
function and for any given data type. In Section 4 we describe the class of functions 
that are covered by this general method; and give counterexamples of functions that 
are not. 

In Sections 5 and 6, we give a general method of separating the inductive and 
coinductive components of productive coinductive values in Coq. In particular. 
Section 5 is devoted to the inductive component and gives general characterisation 
of the method of building ad-hoc predicates for formalising productive values in 
Coq. In Section 6, we characterise the coinductive component. 

Notably, there are two predicates, eventually and infinite, that are essential 
for characterisations of inductive and coinductive components, respectively. The 
similar "eventually" and "infinite" were first introduced as temporal modalities in 
[42], and their coalgebraic specification was given by Jacobs in [34]. We show how 
to formalise lemmas relating eventually and infinite, and use these lemmas to 
tie together the inductive and coinductive components of productive functions. 

In Section 7, we prove the "recursive equation lemmas" establishing that our 
formalisations of the productive functions are correct. Prior to this paper, such 
lemmas have never been established; in particular, they were missing in [10,37,38]. 

Finally, in Section 8 we conclude and outline the further work to be done. 

2 Inductive and Coinductive Types in Coq 

In this section, we will give a short exposition of how inductive and coinductive 
types are defined and used in Coq. We will introduce several running examples. 
The related work of developing the theory of corecursive definitions was done in 
HOL and mechanised using Isabelle [41]. For a more detailed introduction to Coq, 
see [12]. 

As we have already mentioned in the introduction, inductive data types are 
defined by introducing a few basic constructors that generate the elements of the 
new type. 

Definition 2.1 The definition of the inductive type of natural numbers is built 
using two constructors D and S: 

Inductive nat : Set := 
I D : nat 
I S : nat -> nat . 

After the inductive type is defined, one can define its inhabitants and functions 

3 
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on it. Most functions defined on the inductive type must be defined recursively, that 
is, by describing values for different patterns of the constructors and by allowing 
calls to the same function on variables taken from the patterns. Thus, division by 
two on natural numbers is computed by a recursive function div2: 

div2(0) = 

div2(S 0) = 

div2(S(S n')) = S(div2 n'). 

And this can be modelled in Coq as follows. 

Definition 2.2 

Fixpoint div2 n : nat := 
match n with 
I => 
I S => 

I S (S n') => S (div2 n') 
end. 

It is essential that recursive functions are defined over arguments of inductive 
types. For instance, natural numbers are given as arguments in the above definition. 

We can use the inductive type of natural numbers and the defined function div2 
to obtain a new recursive function. The function of discrete logarithm of base 2 for 
natural numbers is computed by a recursive function log satisfying the equation 

log(S 0) = 

log(S(Sn)) = S(logS(div2n)). 

But this, as we further explain in Section 4, cannot be modelled directly in Coq. 
However, we can define the inductive predicate characterising the arguments for 
which the discrete logarithm is well-defined: 

Definition 2.3 

Inductive log_domain : nat -> Prop := 
I log_domain_l : log_doniain 1 
I log_domain_2 : 
forall p: nat, log_domain (S (div2 p) ) -> log_domain (S (S p) ) . 

It was observed in [35], that induction gives rise to initial algebras, while coin- 
duction gives rise to final coalgebras; and the basic duality between algebras and 
coalgebras can be expressed as construction versus observation. Let us have a closer 
look at how this idea is realised in Coq. 

The following is the definition of a coinductive type of infinite streams, built 
using one constructor SCons. 

Definition 2.4 The type of streams is given by 

Coinductive str (A: Set) : Set := SCons: A -> str A -> str A. 

4 
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Typically, a stream has the form SCons a s, where a is an element of some set A, 
and s is a stream. There exists a common convention to write a : : s for SCons a 
s. Although syntactically the above definition is very similar to the definitions of the 
inductive types, this coinductive definition supports well the dichotomy construction 
versus observation: given an infinite stream, we can only observe its head, and pass 
on to its tail. The tail will be infinite, too; and only its first element can be observed 
next. 

Definition 2.5 The coinductive function repeat takes as argument an element a 
of some set A and yields a stream where a is repeated indefinitely: 

CoFixpoint repeat (a: A): str A := SCons a (repeat a). 

Notably, we do not have to impose any type requirements on arguments of the 
function, but we require the produced values to be of coinductive type. 

Properties of coinductive data often need to be expressed with coinductive pred- 
icates. To prove some properties of infinite streams, we use the method of observa- 
tion. For example, to prove that the two lists are bisimilar, we must observe that 
their first elements are the same, and continue the process with the next. 

Definition 2.6 Bisimilarity is expressed in the definition of the following coinduc- 
tive type: 

Coinductive bisimilar_s: str A -> str A -> Prop := 
Ibisim: forall (a : A) (s s' : str A), bisimilar_s s s' -> 

bisimilar_s (SCons A a s) (SCons A a s')- 

The definition of bisniilar_s corresponds to the conventional notion of bisim- 
ilarity as given, e.g. in [35]. Lemmas and theorems analogous to the coinductive 
proof principle of [35] are proved in Coq and can be found in [12]. 

Infinite streams are not the only kind of data that is handled by coinductive 
machinery of Coq. We can work with different types of infinite data types, such 
as infinite binary trees or infinite expression trees, see also [3]. Expression trees 
are trees in which every node has one or two children. The nodes of these trees are 
labelled with elements of sets A and B, and we will call them A-nodes and -B-nodes, 
respectively. We will denote expression trees by E(A, B). The expression trees were 
extensively used in formalising real number arithmetic, see [38,25]. 

Definition 2.7 We coinductively define the expression trees in Coq: 

Coinductive ETrees (A B : Set) : Set := 

I A_node : A -> ETrees A B -> ETrees A B 

I B_node : B -> ETrees A B -> ETrees A B -> ETrees A B. 

We define a bisimilarity relation bisimilar _t for this type in [13]. Also, in [13] 
we show that bisimilar _s and bisimilar_t are equivalence relations. 

We have seen that in Coq, inductive types are domains of recursive functions and 
coinductive types are codomains of corecursive functions. We have also observed 
that syntactically, the definitions of inductive and coinductive types in Coq follow 
one and the same scheme. 
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3 Termination and Productivity 

In this section, we will discuss two computational concepts that depend on the re- 
cursive nature of inductive and coinductive definitions, and those are of termination 
and productivity. 

In Coq, as in any other type-theoretic theorem provers (HOL, PVS, and others, 
see [7]), all computations must terminate. Because propositions are represented by 
types and proofs by programs, according to Curry-Howard isomorphism [6,32,27], 
we cannot allow non-terminating proofs, as they may lead to inconsistency. There 
is a technical reason for the termination requirement, too: to decide type-checking 
of dependent types, we need to reduce type expressions to normal form ([7] is a very 
good survey of proof techniques used in type theory). 

Example 3.1 The function div2 from Definition 2.2 is terminative. 

Given any natural number excluding as an input, the function log described 
in Section 2 is terminative. 

For corecursive functions there is a dual notion to that of termination - produc- 
tivity. The notion of termination is used to ensure totality of functions on finite 
objects (initial algebras [35]); while productivity is used to ensure totality of func- 
tions on infinite objects (final coalgebras). The infinite objects that we are going 
to use as running examples through the paper are streams and expression trees as 
defined in 2.4 and 2.7. 

The notion of productivity was first defined in [24,43], in terms of domain theory. 
For a very careful domain theoretic characterisation of productivity of streams and 
trees, see [37]. However, we will omit the domain theoretic definitions here, and 
describe productivity from a computational point of view. Namely, we use recursive 
functions in order to define classes of productive functions in Coq. We hope that 
this section will give the reader the opportunity to capture the spirit of a functional 
approach to productivity. For more on productivity of infinite data structures, see, 
e.g. [19,22,26]. 

Values in co-inductive types usually cannot be observed as a whole, because of 
their infiniteness. Instead, they are often described as some finite tree-like struc- 
tures where some sub-terms still remain to be computed and are described using 
unevaluated functions applied to arguments. Values in co-inductive types are said 
to be productive when all observations of fragments made using recursive functions 
are guaranteed to be computable in finite time. 

When the co-inductive type being considered is the type of streams, we can ask 
to see the element of the stream at position n using the following function: 

Definition 3.2 

nth (SCons a tl) = a 

nth (S n) (SCons a tl) = nth n tl 

It is a regular structural recursive function with structural argument n. 

A given stream s is productive if we can be sure that the computation of the 

6 
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list nth n s is guaranteed to terminate, whatever the value of n is (satisfying the 
condition for the function nth will be enough to ensure that it is satisfied for any 
other recursive function, but this is hard to prove). 

Example 3.3 For any n, the value repeat n is productive, (see also Definition 
2.5). 

We can do the similar recursive observations on coinductive expression trees. We 
describe the observation of values at different positions in an expression tree using 
a type of direction direct and a function fetch that takes sequences of directions 
to access a given position in a tree. 

Definition 3.4 Assuming the type direct is given by the following: 
Inductive direct : Type := L | R, the function fetch of the type 
forall A B:Set, list direct -> ETrees A B -> A+B is defined as follows: 

fetch nil (Ajiode a t) = inl a 

fetch (_::tl) (Ajiode a t) = fetch tl t 

fetch nil (Bjiode b tl t2) = inr b 

fetch (L::tl) (Bjiode b tl t2) = fetch tl tl 

fetch (R::tl) (Bjiode b tl t2) = fetch tl t2 

An expression tree t is said to be productive if the computation of fetch 1 t 
is guaranteed to terminate whatever the value of 1 is. 

We call a function productive at the input value i, if it outputs a productive 
value at i. This understanding of productivity of functions differs slightly from 
the approach of [43,37,38], where a function is said to be productive if it takes 
productive values as an input and outputs productive values. Let us explain this 
on the following examples. 

The function dyn defined below takes an arbitrary value as an input. It returns 
productive values only for some inputs. 

Definition 3.5 Let A, B be of type Set. For a predicate P : B ^ bool and 
functions h : B ^ A, g, g' : B ^ B, we define the function dyn as follows: 

I SCons h{x) (dyn {g{x))) if P{x) 
dyn (x) 

dyn {g'{x)) otherwise. 

Thus, unlike [37,38], we do not require productive functions to take coinductive 
values as an input. The only requirement we impose is that the produced data must 
be coinductive. This approach is consistent with the understanding of coinductive 
functions in Coq: arguments of corecursive functions can be of any type, and only 
the produced values are required to be of coinductive type. 

There is a tradition of studying productive functions, probably meaning that 
these functions are totally productive once given productive values as arguments. 
However, in this work we want to study functions that are only partially productive, 
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that is, functions that will return productive values only for a subset of their in- 
put type, a subset which we characterise precisely using inductive and coinductive 
predicates. We illustrate this further as follows. 

Consider the filter function on streams that was formalised in [10] and was used 
to filter prime numbers. 

Definition 3.6 (Filter for streams). 

For a given predicate P, 

\ SCons X (filter tl) if P{x) 
filter (SCons x tl) 

filter tl otherwise. 

The filter function examines the first element x of a given list (SCons x tl) for 
a property P, and, in case the property is satisfied, it uses x to form a new list. 
Then it recursively examines the tail of the stream. 

In general, filter functions on streams make it possible to define non-productive 
values. The filter function above can be non-productive on certain values and for 
certain predicates. 

Example 3.7 For instance, computing nth (filter even (repeat 1)) pro- 
vokes the following computation: 

filter even (repeat 1) repeat 1 -w filter even (l::repeat 1) -^ 
filter even (repeat 1)... 

The first arrow comes from computing a little portion of (repeat 1). The second 
arrow comes from observing that 1 is not even and reducing the definition of filter. 
This leads to the same position as initially without producing the first element of 
the stream required by (nth 0). The same computation should be triggered again 
and indefinitely. 

The method of formalising corecursive functions that we propose in this paper, 
makes it possible to formalise such functions in Coq, using inductive and coinductive 
predicates to characterise the arguments on which these functions output productive 
values. By abuse of terminology that causes no confusion, we will call these functions 
"productive" . 

We conclude this section with another important example of a partially produc- 
tive corecursive function. The following functions generalise the filter from Defini- 
tion 3.6 to the case of expression trees with dynamic filtering: 

Definition 3.8 (Dynamic Filter for Expression Trees). 

Let P and Pi be predicates, and let hi : Ai ^ A2, h2 : Bi ^ A2, h'l : Ai — > yli, 
h'2 '■ B\ ^f Bi, h3 : ETrees AiBi -^ ETrees AiBi. The latter functions will 
"dynamically" change the filtered values in the process of filtering: 



A_node hi{a) (e_filter (A_node h[{a) h)) if P(a) 
e_filter (A_node a ti) = <^ 

e_filter h^ (A_node a ti) otherwise. 
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e_filter (B_node b ti ^2) 



A_node /i2(6) (e_filter (B_node h':^{b) ti ta)) if Pi{b) 
e_filter /13 (B_node b ti ^2) otherwise. 



The function was used to estabhsh a normahsation algorithm for an admissible 
representation of a closed interval of real numbers in [27,37]. 

We have discussed the reasons why termination and productivity are important 
for our theory, and also we showed how they relate to each other. We introduced 
important examples of partially productive functions. In the next section, we will 
discuss the way how terminative and productive functions are syntactically defined 
in Coq. 

4 Structural Recursion and Guardedness; Method of 
Ad-hoc Predicates 

There are two syntactic tests that ensure termination and productivity of functions 
in Coq, they are called structurally smaller calls condition [21,40] and guarded-by- 
constructors condition [31]. 

A structurally recursive definition is such that every recursive call is performed 
on a structurally smaller argument. In this way we can be sure that the recursion 
terminates. 

Guardedness is a sufficient condition for productivity. It is described in two 
steps. The first step defines pre-guarded positions. A position is pre-guarded if it 
occurs as the root of the function body, or if it is a direct sub-term of a pattern- 
matching construct or a conditional statement, which is itself in a pre-guarded 
position. 

The second step defines guarded positions. A position is guarded if it occurs as 
a direct sub-term of a constructor for the co-inductive type that is being defined 
and if this constructor occurs in a pre-guarded position or a guarded position. A 
corecursive function is guarded if all its corecursive calls occur in guarded positions. 

Guardedness ensures that at least one constructor of the co-inductive type is 
produced for each corecursive call, and thus at least a fragment of corecursive data 
is produced each time a corecursive call occurs. 

Example 4.1 The function div2 from Definition 2.2 is structurally recursive. 
The function repeat from Definition 2.5 is guarded. 

Definitions where the recursive calls are not required to be on structurally smaller 
arguments, that is, where the recursive calls can be performed on any argument, 
are called general recursive. Many important algorithms, such as the algorithm 
of computing logarithm discussed in Section 2, are not structurally recursive but 
general recursive. 

The standard way of handling general recursion in constructive type theory uses 
a well-founded recursion principle derived from the accessibility predicate Ace [2,39]. 
The idea behind the accessibility predicate is that an element a is accessible by a 

9 
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relation < if there is no infinite decreasing sequence starting from a. A set A is said 
to be well-founded with respect to < if all its elements are accessible by <. Hence, 
to guarantee that a general recursive algorithm that performs the recursive calls on 
elements of type A terminates, we have to prove that A is well-founded and that 
the arguments supplied to the recursive calls are smaller than the input. 

In Coq, the method was implemented in [33,4], see also [12]. The method of 
using accessibility predicates was improved by Bove in her thesis [15] and series of 
papers [14,17,16,18]. The core of the improvement proposed by Bove was to separate 
computational and logical parts of the definitions of general recursive algorithms. 
That is, the method amounts to defining an inductive special-purpose accessibility 
(ad-hoc) predicate that characterises the inputs on which the algorithm terminates. 
Proving that a certain function is total amounts to proving that the corresponding 
accessibility predicate is satisfied on every input. 

Example 4.2 We continue Definition 2.3, and formalise the function log defined in 
Section 2. We need an inversion lemma about the predicate log_domain expressing 
that when x has the form "S (S p)", if x is in the domain, then "S (div2 p)" also 
is: 

Lemma log_domain_inv : 

forall X p : nat, log_domain x -> x = S(S p) -> 
log_domain (S (div2 p)). 

At each recursive call of the function, we need to use the above inversion lemma 
stating that the proof argument for recursive call can be deduced from the initial 
proof argument. 

We also need to express that is not in the domain of the function: 
Lemma log_domain_non_0 : forall x mat, log_domain x -> x 7^ 0. 

Now we can use the ad-hoc predicate log_domain_inv, together with the function 
div2 from Definition 2.2 and define the function log as a usual structurally recursive 
function, but where the structural argument is the proof argument] we box the proof 
arguments in the example below. 



Fixpoint log (x:nat) (h: log_domain x) 
match X as y return x = y -> nat with 
I => fun h' => False_rec nat 



struct h : nat := 



(log_domain_non_0 x h hO 



(log_domain_inv x p h hO) 



I S => fun h' => 
I S (S p) => 

fun h' => S (log (S (div2 p) ) 

end (ref l_equal x) . 

It is important that the Coq checker can recognise log_domain_inv x p h h' as a 
structurally smaller proof with respect to h. See also [12], Section 15.4. 

However, the source of our interest in this method of ad-hoc predicates lies not 
in general recursive functions, but in productive non-guarded corecursive functions. 
In the same way as the syntactic structurally recursive condition removes many 
useful terminative functions from the picture, the syntactic guardedness condition 

10 



Bertot and Komendantskaya 

of Coq rejects many productive functions. 

Informally speaking, the guardedness condition insures that 

* each corecursive call is made under at least one constructor; 

** if the recursive call is under a constructor, it does not appear as an argument of 
any function. 

Violation of any of these two conditions makes a function rejected by the guard- 
edness test in Coq. 

Example 4.3 The corecursive functions from Definitions 3.6, 3.8 and 3.5 are not 
directly formalisable in Coq, because they do not satisfy the guardedness condition 
*. However, we know about many useful examples of filters from Definitions 3.6 
and 3.8 that are productive, [10,38,25]. 

All the examples we have given so far exhibit partially productive non-guarded 
functions. And this may give the reader an impression that it is "partiality" of 
corecursive functions that makes them non-guarded. However, totally productive 
functions can be non-guarded, too. Moreover, we claim that every terminative 
function gives rise to a non-guarded totally productive function, or, in other words, 
one can always craft a totally productive non-guarded function using a terminative 
function. 

Example 4.4 For example, terminative function x — 1 gives rise to the totally 
productive function f : str nat -> str nat : 

x: :f (v: :tl) if a; < w 

f (x::y::tl) • ' 

f ((x-1) : :y: :tl) otherwise. 



This second corecursive call does not satisfy the guardedness condition 



* 



In general, the fact that one uses a terminative function to define a corecursive 
function guarantees productiveness, but in the same time one can always craft the 
resulting corecursive function in such a way that a corecursive call will not be under 
a constructor, thus violating guardedness condition *. 

The next example illustrates the class of functions that fail to satisfy the guard- 
edness condition **. 

Example 4.5 Consider the following function computing lists of ordered natural 
numbers: 

nats = (SCons 1 (map (+ 1) nats)). 

where the function map above is defined as follows: 

map f (s: str): str := Cons (f (hd s)) (map f (tl s)). 

That is, we start with the list with 1, then add (+ 1) to the head of the list to get 
the second element 2; and continue the same computation on the tail. 

The recursive call here is made under the constructor SConc, but it also appears 
on the argument place of the function map. The latter fact violates the guardedness 

11 



Bertot and Komendantskaya 

condition **, and hence the function will be rejected by Coq. Despite of this, the 
value nats is known to be productive. 

The similar problem arises with the Example 1 in [26]: the function is productive 
but non-guarded. 

In [10] was found a Coq formalisation of a coinductive filter function from Def- 
inition 3.6. The work was aimed at showing that values produced through filter 
functions could still be described as guarded corecursive functions. Notably, the 
solution involved the method of building an ad-hoc predicate similar to the ad-hoc 
predicate of Bove. The fact that the method introduced for inductive algorithms 
is expandable to coinductive ones is significant in its own right. But the example 
of [10] exhibited even more: the method of formalising corecursive filter functions 
requires separation of not only "logical" and "computational" parts of algorithms, 
but also of inductive and coinductive parts. 

In the remaining sections, we develop the general method of formalising produc- 
tive functions that fail to satisfy the guardedness condition *. Our method does not 
cover the class of productive functions that do not satisfy the guardedness condition 
**. Such functions were studied (e.g.) in [1,26]. 

5 Inductive Components of Corecursive Functions 

For each productive function, we will describe two modalities, eventually and 
infinite that we are going to use when characterising the inductive and coinductive 
components of productive functions. These are variations of O and □ specified 
coalgebraically in [34], and they originate from the temporal modalities introduced 
in [42]. 

The conventional definitions of eventually and infinite as formalised, (e.g.) 
in [12], express whether a given stream s satisfies some given property P at least 
once or infinitely many times. We modify these predicates in order to characterise 
the conditions for a corecursive function to perform a guarded corecursive step. In 
this section, we consider eventually and its role in defining a recursive function 
which characterises the inductive component of a given corecursive function. 

We propose to take as a starting point the defining equation of the corecursive 
function we wish to formalise; as e.g.. Definitions 3.6, 3.8, 3.5. Then we define 
predicates on the input types of these functions. The predicate eventually captures 
the conditions for the corecursive function to perform the next guarded corecursive 
step. For any given function, and on any given data type, we can define eventually 
systematically as follows: 

1. The predicate eventually is defined inductively, with one constructor for 
each branch appearing in the function definition. 

For example, we will notice that in Definitions 5.2 and 5.3 the number of con- 
structors we use to define eventually will vary from two to four, according to the 
number of branches appearing in Definitions 3.6, 3.8. 

2. When a branch contains only guarded recursive calls, the constructor ex- 
presses that the input data satisfies the eventually predicate as soon as it satisfies 
all the conditions needed to reach this branch. 

12 
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For instance, the dyn function from Definition 3.5 performs a boolean test on 
the predicate P, and returns the value SCons (h x) (dyn (g x) ) if (P x) is true, 
or the value (dyn (g' x)) if (P x) is false. Each of these values contains a 
recursive call, but the first call on (g x) is guarded while the other call on (g' x) 
is non-guarded. 

For the first recursive call, we can directly provide a constructor to this inductive 
predicate that states that x satisfies the predicate if (P x) is true. (We choose to 
name the eventually predicate eventually_dyn.) 

ev_dynl : P x = true -> eventually_dyn x 

3. When a branch contains non-guarded recursive calls, the constructor expresses 
that the input data satisfies the predicate as soon as it satisfies all the conditions 
leading to this branch, and the inputs to all non-guarded recursive calls satisfy the 
predicate. 

Let us return to the dyn function. For the second recursive call, we provide a 
constructor stating that x satisfies the predicate when (P x) is false only if the 
recursive call would then reach a value that already satisfies the predicate: 

ev_dyn2 : P x = false -> eventually_dyn (g' x) -> eventually_dyn x 

We have covered all possible recursive branches in the behaviour of dyn, so we 
can collect the constructors in an inductive definition: 

Definition 5.1 

Inductive eventually_dyn (x:B) : Prop := 

I ev_dynl : P x = true -> eventually_dyn x 

I ev_dyn2 : P x = false -> eventually_dyn (g' x) -> eventually_dyn x. 

The same systematic approach gives the predicates eventually_s for the filter 
on streams and eventually_t for the filter on expression trees: 

Definition 5.2 

Inductive eventually_s : str A -> Prop := 

I ev_b: forall x s, P x -> eventually_s (SCons A x s) 

I ev_r: forall x s, 

~ P X -> eventually_s s -> eventually_s (SCons A x s) . 

Note that eventually_t has four constructors, depending on whether the input 
data is an A-node or a B-node and depending on whether the observable data carried 
by this node satisfies a predicate P or not, precisely as in Definition 3.8: 

Definition 5.3 Inductive eventually_t : ETrees Al Bl -> Prop : = 
|ev_rB: forall x t tl , 

~P1 X -> eventually_t (h3 (B_node Al Bl x t tl)) -> 

eventually_t (B_node Al Bl x t tl) 
|ev_bB: forall x t tl, PI x -> eventually_t (B_node Al Bl x t tl) 
|ev_rA: forall x t, 

~P X -> eventually_t (h3 (A_node Al Bl x t)) -> 

eventually_t (A_node Al Bl x t) 
|ev_bA: forall x t, P x -> eventually_t (A_node Al Bl x t) . 

13 
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Thus, eventually_dyn, eventually.t and eventually_s are constructed using 
one systematic method. 

The eventually predicates are satisfied if the conditions for the recursive func- 
tion to perform a guarded corecursive call are satisfied at least once. In the future, 
we will say that eventually ensures only the first-step-productivity, as opposed to 
the conventional productivity of Section 3. 

Using the inductive definition of eventually predicates, we can describe the 
first component of the function we want to compute. This component is a recursive 
function that performs all the computations and tests that lead to the first guarded 
corecursive call. For lack of a better name, we will call this function the inductive 
component of the corecursive function being defined. Separating this inductive 
component from the rest of the function behaviour is interesting: it will hide all the 
non-guarded corecursive calls and make it possible to go directly from one guarded 
call to the next one. 

This inductive component is modelled by following the technique of ad-hoc pred- 
icates where the chosen predicate is the eventually predicate. 

The method of ad-hoc predicates relies on inversion lemmas. Each inversion 
lemma expresses that the argument to the recursive call satisfies the eventually 
predicate if the initial argument does. Moreover, the proof of each inversion lemma 
is carefully crafted to make sure that the lemma's conclusion is understood as a 
structurally smaller proof with respect to one of the lemma's premises (this is the 
essence of recursion on an ad-hoc predicate, see [12]). 

There is one inversion lemma needed for the dyn function and the filter on 
streams. Two inversion lemmas are needed to formalise the filter on trees. In 
general, the number of the required inversion lemmas equals to the number of non- 
guarded calls in the initial corecursive function. 

Lemma 5.4 (Inversion Lemmas) Lemma eventually_dyn_inv : 
forall X, eventually_dyn x -> P x = false -> eventually_dyn (g' x) . 

Lemma eventually_s_inv: forall (s : str A), 
eventually_s s -> forall x s', s = SCons A x s' -> 

~ P X -> eventually_s s ' . 

Lemma eventually_t_inv: 
forall t : ETrees Al Bl, eventually_t t -> 
forall (x: Al) (f: ETrees Al Bl) , t = (A_node Al Bl x tO -> 

~ P X -> eventually_t (h3 (A_node Al Bl x t')). 

Lemma eventually_t_inv' : 
forall t : ETrees Al Bl, eventually_t t -> 
forall(x:Bl) (f tl: ETrees Al Bl) , t = (B_node Al Bl x t' tl) -> 

~P1 X -> eventually_t (h3 (B_node Al Bl x t' tl)). 

We give the full proofs of these inversion lemmas in [13]. 

We use these inversion lemmas to define the inductive component of corecursive 
functions: this inductive component does all computations until it performs the first 
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guarded corecursive call. For instance, for the function on streams, the inductive 
component performs all computation steps until one reaches the first element in the 
stream that satisfies the property P used for filtering. 

The data produced by the inductive component is organised in two parts: the 
first part contains the observable data that is included in the head constructor of 
the output. Such a head constructor is necessarily present, because the eventually 
predicate holds and this means that there is a guarded corecursive call. The second 
part contains the argument for the next recursive call of the initial corecursive 
function. 

The inductive component can perform recursive calls to itself, but they go from 
values satisfying the eventually predicate to values satisfying the eventually 
predicate, thanks to the inversion lemmas. 

Definition 5.5 The inductive component of the recursive function for dyn is called 
pre_dyn; the additional (proof) arguments are boxed below: 



Fixpoint pre_dyn (x:B) 



(d:eventually_dyn x) 



{struct d} : A*B := 



(eventually_dyn_inv x d t) 



match P X as b return P x = b -> A*B with 
true => fun t => (h x, g x) 

I false => fun t => pre_dyn (g' x) 

end (refl_equal (P x)). 

This code exhibits two aspects of programming with ad-hoc predicates, which make 
them complicated to read for neophytes: first, the function has an extra proof 
argument and recursive calls must also have this extra proof argument. Second, 
the body of the function contains a pattern-matching construction that must be 
applied to a proof of P x = P x. Inside the proof matching construct, this equality 
is transformed differently in the two branches: in one branch it takes the form of P 
X = true while in the other it takes of the form of P x = false. 

For the function e_f liter, the inductive component is named pre_f ilter_t. 

Definition 5.6 Fixpoint pre_filter_t (t : ETrees Al Bl) 
(h: eventually_t t) {struct h} : A2 * ETrees Al Bl := 
match t as b return t = b -> A2*ETrees Al Bl with 
|A_node x t' => fun heq : t = (A_node Al Bl x t') => 

match P_dec x with 

Heft _ => (hi (inl Bl x) , A_node (fst (h2 (inl Bl x) ) ) t') 

I right hn => pre_filter_t (h3 (A_node Al Bl x t')) 

(eventually_t_inv t h x t' heq hn) 

end 
|B_node x t' tl => fun heq': t = (@B_node Al Bl x t' tl) => 

match Pl_dec x with 

Heft _ => (hi (inr Al x) , B_node (snd (h2 (inr Al x))) t' tl) 

I right hn' => pre_f ilter_t(h3 (B_node Al Bl x t' tl)) 

(eventually_t_inv' t h x t' tl heq' hn') 

end 
end (ref l_equal t) . 
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We formulate the omitted inductive component of the filter for streams in [13]. 

In this section, we have shown the algorithm to formalise the inductive compo- 
nents of several recursive functions. It remains to describe how computation can 
carry on beyond the first guarded corecursive call. 

6 Coinductive Components of Corecursive Functions 

In this section we characterise the coinductive part of corecursive functions. Core- 
cursive computations are introduced by repeating computations performed by the 
inductive component defined in the previous section. Now, because the induc- 
tive component finds guarded corecursive call, guarded co-recursion can take place. 
However, this can only happen if recursive calls satisfy the eventually predicate 
repeatedly. We need an extra predicate to express this. 

We define a coinductive predicate infinite, a variation of which was denoted 
by □ in the examples of coalgebraic specifications by Jacobs [34] . It is notable that 
M. Niqui, who suggested a similar formalisation of filter functions for streams, used 
a predicate is_inf inite? in order to extract the desired guarded filter, see [37]. 

We suggest to define infinite predicates co-inductively by expressing that a 
value satisfies this predicate if it satisfies the eventually predicate and the second 
part computed by the inductive component satisfies the infinite predicate again. 
Remember that this second part is the argument for the next recursive call of the 
main corecursive function. 

Definition 6.1 For the dyn function, we choose to call the infinite predicate 
inf inite_dyn. It is described as follows: 

Coinductive infinite_dyn (x : B) : Prop := 
di : forall d: eventually_dyn x, 
infinite_dyn (snd (pre_dyn x d) ) -> infinite_dyn x. 

For the f ilter_s function, we choose to call the predicate inf inite_s: 

Coinductive infinite_s : str -> Prop := 

al_cons : forall (s:str A) (h: eventually s) , 
infinite_s (snd(pre_f ilter_s s h)) -> infinite_s s. 

For the e_f ilter function we choose to call the predicate inf inite_t: 

Coinductive infinite_t : ETrees Al Bl -> Prop := 
cf: forall (t: ETrees Al Bl) (h:eventually_t t) , 

infinite_t (snd (pre_f ilter_t t h)) -> infinite_t t. 

Notice that the shape of infinite predicates for all these functions is almost 
identical. 

The infinite predicate describes exactly those arguments to the function for 
which the function is guaranteed to be productive. Thus, we will reproduce the 
scheme already followed in recursion with respect to an ad-hoc predicate: the recur- 
sive function will be modelled in Coq by a function that receives an extra argument, 
a proof that the initial argument satisfies the infinite predicate. Computation will 
be performed by repeatedly calling the inductive component of the function. Since 
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the inductive component function can only compute on arguments that satisfy the 
eventually predicate, we need a lemma stating that the infinite predicate im- 
plies the eventually predicate. This lemma is always obtained using a very simple 
proof by pattern matching. 

For the dyn and f ilter_t functions the lemma has the following statements: 

Lemma 6.2 

Lemma inf inite_eventually_dyn : 

forall X, infinite_dyn x -> eventually_dyn x. 

Lemma inf inite_eventually_t : 

forall t, infinite_t t -> eventually_t t. 

After the computation of the first component, computation must go on, with 
recursive calls obtained from the second part of the data computed in the inductive 
component. However, this data must also be associated with a proof that it satisfies 
the infinite predicate. This is expressed by means of the following lemmas, given 
for dyn and e_f ilter: 

Lemma 6.3 

Lemma inf inite_always_dyn : 

forall X, infinite_dyn x -> forall e: eventually_dyn x, 
infinite_dyn (snd (pre_dyn x e)). 

Lemma inf inite_always_t : forall(t : ETrees Al Bl) (h: inf inite_t t) , 
infinite_t (snd (pre_f ilter_t t (inf inite_eventually_t t h))). 

We give the proofs of these lemmas in [13]. 

Finally, we use these two categories of lemmas to formalise the main functions 
from Definitions 3.5, 3.8 as guarded corecursive functions. 

Definition 6.4 For the dyn function the definition has the following shape (proof 
arguments are boxed): 



CoFixpoint dyn (x:B) 
SCons (fst (pre_dyn x 
(dyn . 



(h: inf inite_dyn x) 



str : = 
(inf inite_eventually_dyn ev x h) )) 



(inf inite_always_dyn x h (inf inite_eventually_dyn x h)) 



For the e_f ilter function, the definition has the following shape: 

CoFixpoint e_f ilter (t:ETrees Al Bl)(h: infinite_t t) 

: ETrees A2 B2 := 
match t return infinite_t t -> ETrees A2 B2 with 
I A_node x t' => fun h' : infinite_t (A_node x tO => 

A_node (fst (pre_f ilter_t t (inf inite_eventually_t h'))) 
(e_f ilter _ (inf inite_always h')) 
I B_node x t' t2 => fun h' : infinite_t (B_node x t' t2) => 
A_node (fst (pre_f ilter_t t (inf inite_eventually_t h'))) 
(e_f ilter _ (inf inite_always h')) 

17 



Bertot and Komendantskaya 

end. 

The guardedness of the functions is simply achieved by the presence of con- 
structors SCons or A_node. The recursive caUs receive the extra proof arguments. 
Notice that we don't need to give exphcitly the value on which the recursive call 
is performed, because this value can be inferred from the second proof argument. 
This explains why we used the anonymous value place holder _ in several places. 

We formalise the filter from Definition 3.6 in [13]. 

In this Section, we have given a systematic characterisation of the coinductive 
components of productive functions. We employed the predicate infinite to obtain 
the guarded formalisation of productive functions in Coq. It remains to provide tools 
to reason on the functions we modelled. 

7 Proving Properties About the Models 

In this section, we give the proofs of recursive equation lemmas, statements that 
express that the functions we obtain in Definition 6.4 really are good models of the 
non-guarded functions from Definitions 3.5, 3.6, 3.8. We discuss how such proofs 
can be obtained in a systematic way, outlining a few useful lemmas about these 
functions. 

The inductive component and coinductive component for each corecursive func- 
tion are modelled by Coq functions which take both a regular argument (named x in 
the function dyn and t in the function e_f ilter) and a proof about this argument 
(named h in both dyn and e_f ilter). Of course, we expect the result to depend 
on the regular argument, but we do not expect the computation to depend on the 
value of the proof. We only expect the computation to depend on the existence of 
this proof. In other words, when receiving a given argument x and two different 
proofs el and e2 that x satisfies the inf inite_dyn predicate, we expect the result- 
ing values dyn x el and dyn x e2 to be the same. This needs to be expressed in 
lemmas which we call proof-irrelevance lemmas. 

Lemma 7.1 (Proof-irrelevance Lemmas for dyn) . 

For the dyn function the proof irrelevance lemmas have the following shape: 

Lemma pre_dyn_prf _irrelevant : 
forall X el e2, pre_dyn x el = pre_dyn x e2. 

Lemma dyn_prf _irrelevant : 

forall X x' (i: infinite_dyn x) (i' : inf inite_dyn x'), x = x' -> 
bisimilar_s (dyn x i) (dyn x' i')- 

The proofs of these lemmas are given in [13]. Also, in [13] we formulate and prove 
the similar proof irrelevance lemmas for filters on streams and expression trees. A 
similar proof irrelevance lemma was used in [38] ; a very good study of applications 
of the Principle of Proof Irrelevance in type theory can be found in [8]. 

Note that the above proof irrelevance lemma, as well as analogous lemmas for 
filters on streams and trees [13], uses the "bisimilar" relation from Definition 2.6 
instead of the equality relation. 
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In coalgebra, one has the coinductive proof principle which states that bisimilar 
objects are equivalent, [35]. The constructive theory distinguishes equahty and 
bisimilar ity; and in many cases we can only give constructive proofs of bisimilarity 
of infinite objects; but not of their equality. 

Nevertheless, for all practical purposes, bisimilar objects can be used like equal 
objects. For example, one can prove in Coq that equality implies bisimilarity; or 
that for finite objects bisimilarity implies equality. Lemmas and theorems relating 
bisimilarity and equality can be found in [12]. 

It remains to show that the functions dyn, filter and e_f ilter modelled in 
Coq actually perform the computation that was initially intended. The main idea 
is to perform as in [4,5] and to prove a recursive equation. However, we have to 
cope with the last difficulty. The models we obtain have an extra argument, which 
is used to restrict the function to the inputs on which they really are productive. 

We circumvent the difficulty by integrating extra proof arguments in the for- 
mulation of the recursive equation. As a consequence, we need a few more lemmas 
expressing that all recursive calls that happen in the recursive equation are made on 
values that satisfy the infinite predicate, as soon as the initial argument already 
satisfies this predicate. These lemmas will be called step-lemmas. 

Example 7.2 For instance, the specification of dyn imposes that there should be a 
recursive call dyn (g' x) as soon as P x = false. Thus, we need to have a lemma 
that states that (g' x) satisfies inf inite_dyn as soon as x does and P x = false. 

Lemma 7.3 (Step-lemmas for dyn) . 

For the function dyn, there are two step lemmas: 

Lemma dyn_stepl : 

forall X, P X = true -> infinite_dyn x -> infinite_dyn (g x) . 

Lemma dyn_step2 : 

forall X, P X = false -> inf inite_dyn x -> inf inite_dyn (g' x) . 

We formalise the proofs for these lemmas in [13]. The first lemma expresses that 
the recursive call on g x is legitimate when P x is true, while the second lemma 
expresses that the recursive call on g' x is legitimate when P x is false. 

We prove the similar step-lemmas for filters on expression trees and streams in 
[13]. They are all formulated uniformly with the step-lemmas above. For expression 
trees, for example, we need four step lemmas, to give an account both for the cases 
when P X is true or false, and for the two constructors used to define expression 
trees. Conceptually, all these lemmas can be formulated in a uniform way. 

We can then formulate the recursive equation for the functions we modelled. 
This recursive equation is not as easy to read as the initial description, because 
proof arguments have been added in many places, but if we ignore these proof 
arguments, the initial intent appears clearly. 

For the dyn function, the equation lemma has the following shape, where bisim- 
ilarity is again used instead of equality. To help the reader to distinguish proof 
content from algorithmic content, we boxed the proof arguments. 

Theorem 7.4 (Recursive Equation Lemma for dyn) . 
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Theorem dyii_ equation : 

, bisimilar_s (dyn x 



forall X 
(match P X 

I true => 

I false => 

end 



as b return P x = b -> infinite_dyn x -> str 



with 



fun t i 



fun t i => 



(refl_equal (P x)) i 



=> SCons (h x) (dyn (g x) 
dyn (g' x) 
). 



(dyn_stepl x t i) 



(dyn_step2 x t i) 



We prove this lemma in [13], and give similar proofs for e_filter and filter. 

In [10], recursive equations were not given or proved: the properties of interest 
for the filter output were directly stated and proved, culminating with the proof that 
the infinite stream of prime integers could be obtained by repetitively filtering from 
the infinite stream of natural numbers. In this paper, we take a more systematic 
approach, with the hope that all steps can be automated. 

8 Conclusions and Further Work 

In this paper, we revisited the technique developed in [10] to model filter functions 
on streams in a type-theory with coinductive types and guarded corecursion. We 
showed that the same technique could be applied to a wide class of functions: 
firstly, we showed that it could be applied even if the output data-type was not a 
stream type, and secondly, we showed that the input data does not have to be of 
coinductive type. In the process, we delineated the various steps of the description 
and we showed that the technique can lead to the main theorem stating that the 
model exhibits the expected behaviour. 

In practice, this work extends the expressive power of guarded co-recursion by 
showing that some class of totally and partially productive corecursive functions 
can be modelled, even though their initial specification would be expressed in a 
non-guarded corecursive equation. The method presented in this paper does not 
cover the productive functions that fail to satisfy the guardedness condition ** of 
Section 4. Such functions were studied (e.g.) in [1,26]. Combining these methods 
with the method we have described here can be one of the objectives for the future 
work. 

Many of the steps in the technique we describe here can easily be automated, 
others are very tricky to formulate. In future work, we wish to give a precise 
description of the class of functions for which the technique works, and a precise 
description of each step that is taken in producing the eventually predicate, the 
first recursive component using recursion on the ad-hoc predicate eventually, etc. 
In the end, all the steps could be implemented as a program that takes as input the 
syntax tree describing the function one wants to model and produces both the Coq 
model and the main lemmas about this model. 

Another extension of this work is to study what can be done for functions that 
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are provably totally productive (i.e., productive on every input) but yet not guarded. 
In this case, the final recursive equation should be expressible without proof com- 
ponents. This extension can probably take inspiration from the work done on 
well-founded induction. 
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